Predicting accounting fraud using imbalanced ensemble learning classifiers – evidence from China
Abstract
The current research aims to develop effective accounting fraud detection models using imbalanced ensemble learning algorithms for China A-share listed firms. Based on a sample of 33,544 Chinese firm-year instances from 1998 to 2017, this research established one logistic regression model and four ensemble learning classifiers (AdaBoost, XGBoost, CUSBoost, and RUSBoost) using 12 financial ratios and 28 raw data items. Additionally, we divided the sample into train and test observations to evaluate the classifiers' out-of-sample performance. In detail, we applied two metrics, namely the area under the ROC (receiver operating characteristic) curve (AUC) and the area under the precision-recall curve (AUPR), to evaluate the classifiers' discriminability. As a supplementary test, the study put forward an algebraic fused model and introduced a sliding window technique. The empirical results showed that the ensemble learning classifiers can detect accounting fraud for China A-listed firms far more effectively than the logistic regression model. Moreover, the imbalanced ensemble learning classifiers (CUSBoost and RUSBoost) performed better on average than the common ensemble learning classifiers (AdaBoost and XGBoost). The algebraic fused model also obtained the highest average AUC and AUPR among all the employed algorithms. Our results offer firm support for the potential role of Machine Learning (ML)-based Artificial Intelligence (AI) approaches in reliably predicting accounting fraud with high accuracy. Similarly, in China settings, our ML-based AI approach offers the utmost advantage in forecasting accounting fraud. Finally, the paper fills a gap in applications ...
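The two evaluation metrics and the sliding window idea named in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the function names, the toy data, and the window lengths are hypothetical. AUPR is approximated here by the step-wise average precision, which assumes no tied scores.

```python
from typing import Iterator, List, Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the rank (Mann-Whitney) formulation.

    Equals the probability that a random fraud case (label 1) is scored
    above a random non-fraud case (label 0); ties count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise area under the precision-recall curve (average precision).

    Assumption: scores are untied, so the ranking is unambiguous.
    """
    n_pos = sum(1 for y in labels if y == 1)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / n_pos


def sliding_windows(
    years: Sequence[int], train_len: int, test_len: int = 1
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_years, test_years) pairs that roll forward one year at a time.

    Assumption: fixed-length windows; the paper's actual window sizes are not
    given in the abstract.
    """
    ordered = sorted(set(years))
    for start in range(len(ordered) - train_len - test_len + 1):
        split = start + train_len
        yield ordered[start:split], ordered[split:split + test_len]


# Toy, imbalanced example: 2 fraud cases among 8 firm-years (hypothetical data).
toy_labels = [0, 0, 1, 0, 1, 0, 0, 0]
toy_scores = [0.10, 0.40, 0.35, 0.80, 0.90, 0.20, 0.30, 0.05]
toy_auc = roc_auc(toy_labels, toy_scores)
toy_aupr = average_precision(toy_labels, toy_scores)
toy_windows = list(sliding_windows(range(1998, 2004), train_len=3))
```

On this toy data the AUC is 10/12 (about 0.83) while the average precision is 0.75. Because the two metrics weight the rare positive class differently, imbalanced-data studies often report both.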
Similar resources
Learning Classifiers from Imbalanced, Only Positive and Unlabeled Data Sets
In this report, I present my results for the tasks of the 2008 UC San Diego Data Mining Contest. This contest consists of two classification tasks based on data from a scientific experiment. The first task is a binary classification task, which is to maximize classification accuracy on an evenly distributed test data set, given a fully labeled imbalanced training data set. The second task is also ...
Diversified Ensemble Classifiers for Highly Imbalanced Data Learning and their Application in Bioinformatics
In this dissertation, the problem of learning from highly imbalanced data is studied. Imbalanced data learning is both important and challenging in many real applications. Dealing with a minority class normally needs new concepts, observations and solutions in order to fully understand the underlying complicated models. We try to systematically review and solve this special learning task in t...
Online Ensemble Learning for Imbalanced Data Streams
While both cost-sensitive learning and online learning have been studied extensively, efforts to deal with these two issues simultaneously have been limited. To address this challenging task, a novel learning framework is proposed in this paper. The key idea is based on the fusion of online ensemble algorithms and the state-of-the-art batch-mode cost-sensitive bagging/boosting algorithms. Within th...
Recognition of Multiple Imbalanced Cancer Types Based on DNA Microarray Data Using Ensemble Classifiers
DNA microarray technology can measure the activities of tens of thousands of genes simultaneously, which provides an efficient way to diagnose cancer at the molecular level. Although this strategy has attracted significant research attention, most studies neglect an important problem, namely, that most DNA microarray datasets are skewed, which causes traditional learning algorithms to produce i...
Financial Accounting Fraud Detection Using Business Intelligence
The paper investigates the inherent problems of financial fraud detection and proposes a forensic accounting framework using business intelligence as a plausible means of addressing them. The paper adopts an empirical case study approach to present how business intelligence could be used effectively in the detection of financial accounting fraud. The proposed forensic accounting framework using...
Journal
Journal title: Accounting and Finance
Year: 2023
ISSN: 0810-5391, 1467-629X
DOI: https://doi.org/10.1111/acfi.13044